Condensed Matter Physics, 200?, Vol. ?, No ?, pp. ?






Scaling in public transport networks 

C. von Ferber, Yu. Holovatch, V. Palchykov

Theoretische Polymerphysik, Universität Freiburg, D-79104 Freiburg, Germany

Institute for Condensed Matter Physics, National Acad. Sci. of Ukraine,
UA-79011 Lviv, Ukraine

Institut für Theoretische Physik, Johannes Kepler Universität Linz,
A-4040 Linz, Austria

Ivan Franko National University of Lviv, UA-79005 Lviv, Ukraine


February 2, 2008 

We analyse the statistical properties of public transport networks. These networks
are defined by a set of public transport routes (bus lines) and the stations
serviced by them. Larger networks appear to possess a scale-free structure,
as demonstrated e.g. by the Zipf law distribution of the number of
routes servicing a given station, or by the distribution of the number of stations
that can be reached from a given station without changing the means of transport.
Moreover, a rather particular feature of public transport networks is
that many routes service common subsets of stations. We discuss the possibility
of new scaling laws that govern intrinsic features of such subsets.

PACS: 89.75.-k, 89.75.Da, 05.65.+b


1. Introduction. What are complex networks for a physicist

Complex networks have only recently become a subject discussed on the pages
of physical journals [?]. However, the statistical mechanics of complex
networks is currently an important and quickly evolving field of physics [?], as one can check
e.g. by searching the WWW (itself another huge complex network
and hence an important subject of study). As has been worked out in the
meantime, complex web-like structures are involved in systems as different as the
already mentioned WWW (with its documents as nodes and links as edges) [?],
the metabolism of a biological cell (substrates connected by biochemical reactions)
[?], social communications (human beings connected by various relationships) [6],
ecological systems (food webs joining different species) [?], etc. These and
similar systems can therefore be described within the same formalism, and very
often they manifest similar statistical behaviour.

Of particular interest for our study is the question of whether the network displays
scale-free properties, a notion introduced to characterize a network which does not possess
a typical scale [?]. A network is called scale-free when its node degree distribution,
i.e. the probability that a randomly selected node has $k$ edges, possesses a power-law
tail:

$P(k) \sim k^{-\gamma}$.    (1)

Since its first observation by G.K. Zipf in quantitative linguistics [?], the power
law is referred to as Zipf's law. For a physicist working in the field of statistical
physics, the appearance of universal power laws (scaling laws) serves as a manifestation
of the collective behaviour of a many-particle system at the critical point [?]. This
explains, in particular, why physicists (often those working in the field of critical
phenomena) apply their efforts and skills to network theory and why these
efforts may be fruitful.

The last decades of the past century offered theoretical descriptions of critical
phenomena which show why universal scaling laws (1) emerge and give reliable
numerical estimates for the exponents $\gamma$ governing the scaling of different physical
observables [?]. This description was made possible by the application of
field-theoretical methods in many-particle physics [?] and serves as a background to explain
and to predict scaling in various systems. However, the modern theory of networks
differs from the modern theory of critical phenomena: although it
operates with models describing different types of networks and looks for the Zipf laws
governing the scaling of the properties of these networks, the prediction or
explanation of certain scaling properties is done mainly via computer simulations or a simple
(mean-field-like) analysis. A theoretical description of complex networks involving
e.g. a theory beyond the tree graph approximation, similar to the field-theoretical
description of critical phenomena with non-trivial interactions [13, ?, ?], is still
missing.

At this level of network analysis it is important both to search for new types of
networks that exist in complex structures and to collect "empirical data": to
look for observables describing these networks and to analyse their properties. In our
paper, we want to draw attention to a feature frequently encountered in complex
networks: looking at the paths of connections on the "motherboard" of a computer,
the wiring in a car, or even the neural connections along the spine of vertebrates one 
observes as a common feature that the physical paths used by the lines connecting 
different nodes are often shared by many other lines connecting other nodes. We are 
interested in the distribution of the load along these paths. We study this property 
for more easily accessible examples of networks of public transport (PT networks). 
We will show that a PT network may demonstrate a scale-free behaviour. Moreover, 
a rather particular feature of the PT network is that many routes possess common 
subsets of stations. We will demonstrate that new scaling laws may govern intrinsic 
features of distributions defined on these subsets. 

The rest of the paper is organized as follows: in the next section 2 we explain
what we mean by a PT network, introduce the observables used to describe it and
give the sources for our further analysis. In section 3 we will show that the node
degree distribution of PT networks obeys the Zipf law (1) and hence a PT network
may constitute an example of a scale-free network [3]. In section 4 we will consider in
particular the traffic load distribution of paths of different length, studying situations
where many routes possess common subsets of nodes. We check these properties for
scale-free behaviour and speculate on a generalization. Conclusions and an outlook
are given in section 5.

Figure 1. A part of the public transport scheme for Paris and the PT network
which corresponds to it.

2. A PT network 

In our study we consider networks of public transport (buses, trams, and subways),
called hereafter PT networks. In a PT network, the nodes are the stations of
public transport and the edges are the links connecting them along the route (see
figure 1). We will be interested in various characteristics of PT networks that describe
statistical properties of node-link distributions. Examples are given below. We
will perform our study according to the following scheme:

a) choose a PT network; 

b) make an ordered list of stations visited by each line; 

c) check the network statistics. 

Let us comment on each of the above items before giving results about the network 
structure. 

a) As usual, to allow general properties of a network structure to manifest
themselves, the network analysed should be large enough in terms of numbers of nodes
and edges. Therefore, we choose the PT networks of big cities, which have large numbers
of routes and stops. The results presented below are based on an analysis of the PT
networks of Berlin (198 routes and 2952 stations), Düsseldorf (124 and 1615) and
Paris (232 and 4003). In an extension of the present study, which is under way, we
analyse the PT networks of more and larger cities.

b) The schedules of public transport for the above cities were downloaded from
the internet [?] and brought into an appropriate format to construct the ordered lists
of stations serviced by each line (hereafter we do not distinguish between
bus, tram, and subway PT lines). These lists serve as the basis for the network structure
analysis.

c) The ordered lists of stations allow us to compute statistics of typical
quantities describing the network. The simplest ones are the number of
lines that service a given station [?] and the number of other stations one may
reach from a given station without changing the bus/tram. The distribution of the
first quantity gives the familiar node degree distribution $P(k)$ introduced in the
previous section. The second quantity arises as the node degree
of a conjugated network in which each station (node) is connected with
all other stations for which there is a route servicing both. It has quite practical
consequences: it describes the neighbourhood of a given station and hence its "utility".
Below, we denote the size of this neighbourhood by $Z_1$.

Figure 2. Number of stations as a function of the node degree $k$ for the PT
networks of Berlin (squares), Düsseldorf (crosses), Paris (circles). Being normalized,
this function gives the node degree distribution $P(k)$ (1).
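
The two station-level quantities described in item c) can be illustrated by a minimal Python sketch. It counts, for each station, the number of lines servicing it (the node degree k) and the number of other stations reachable without a change (the neighbourhood size, written Z_1 here). The toy `routes` dictionary and the function name `station_statistics` are hypothetical; they are not the data or the code used for the networks analysed in this paper.

```python
from collections import defaultdict

# Hypothetical toy input: each route is an ordered list of station names.
routes = {
    "bus_1": ["A", "B", "C", "D"],
    "bus_2": ["B", "C", "E"],
    "tram_3": ["F", "C", "D"],
}


def station_statistics(routes):
    """Return (degree, neighbourhood) dictionaries keyed by station.

    degree[s]        -- number of routes that service station s
    neighbourhood[s] -- number of other stations on the routes through s
    """
    lines_at = defaultdict(set)    # station -> routes passing through it
    reachable = defaultdict(set)   # station -> stations sharing a route with it
    for name, stops in routes.items():
        unique_stops = set(stops)
        for s in unique_stops:
            lines_at[s].add(name)
            reachable[s] |= unique_stops
    degree = {s: len(r) for s, r in lines_at.items()}
    # Subtract 1 so that the station itself is not counted as its own neighbour.
    neighbourhood = {s: len(r) - 1 for s, r in reachable.items()}
    return degree, neighbourhood


degree, neighbourhood = station_statistics(routes)
print(degree["C"], neighbourhood["C"])  # station C lies on all three toy routes
```

Histograms of the values in `degree` and `neighbourhood` over all stations would then give the two distributions discussed in the text.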

Many different characteristics of the network can be introduced depending on
the particular problem one is interested in. Here, we want to draw attention to a
particular feature of a PT network: often, a sequence of nodes is joined by more than
one line. This is the familiar situation when one can go from one station to another
by different train or bus lines without making a change. To study the distribution
of such sequences of stations, let us introduce the quantity $P(L, N)$: the number of
node segments of length $L$ connected by $N$ lines.
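
One possible way to count such shared segments is sketched below in Python. Several details are assumptions made only for this illustration and are not fixed by the text: a segment of length L is taken to be a run of L consecutive links (L + 1 stations) along a route, the direction of travel is ignored, and a segment is counted under N when exactly N distinct routes contain it. The toy `routes` data and the name `segment_counts` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy input: each route is an ordered list of station names.
routes = {
    "bus_1": ["A", "B", "C", "D"],
    "bus_2": ["B", "C", "E"],
    "tram_3": ["F", "C", "D"],
}


def segment_counts(routes, L):
    """Map N -> number of distinct segments of L links shared by exactly N routes."""
    lines_on = defaultdict(set)  # segment -> set of routes containing it
    for name, stops in routes.items():
        # Slide a window of L + 1 consecutive stations along the route.
        for i in range(len(stops) - L):
            seg = tuple(stops[i:i + L + 1])
            key = min(seg, seg[::-1])  # same key for both travel directions
            lines_on[key].add(name)
    return Counter(len(r) for r in lines_on.values())


p1 = segment_counts(routes, 1)
p2 = segment_counts(routes, 2)
print(dict(p1), dict(p2))
```

In this toy network the links B-C and C-D are each used by two routes, so two of the five single-link segments fall under N = 2; no two-link segment is shared.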

Results of the numerical analysis of the quantities $P(k)$ and $P(L, N)$ introduced above
will be presented in the next two sections.

3. Scale-free behaviour 

First, we examine the node degree distribution $P(k)$ of the PT networks. Results
are shown in figure 2, where we plot the number of stations for the PT networks
of Berlin, Düsseldorf, and Paris as a function of the number of routes going through them
(the node degree $k$ [?]). Being normalized by the overall number of stations, this
function gives the node degree distribution $P(k)$, hence both quantities obey the
same scaling. One observes a power-law behaviour in figure 2, which leads to the
conclusion that the PT networks may be scale-free. The least-squares fit to all data
points of each city gives the exponents $\gamma$: Berlin: 2.90; Düsseldorf: 2.45; Paris: 2.94. The values for the
two larger networks are close to $\gamma = 3$, corresponding to the preferential attachment
scenario [12].
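
For orientation, a least-squares estimate of a power-law exponent can be sketched as a straight-line fit in double-logarithmic coordinates. The Python example below is a generic illustration on synthetic data; it is an assumption that the fit is done on log-log values, and it is not necessarily the exact procedure behind the exponents quoted above. The function name `fit_power_law_exponent` is hypothetical.

```python
import math


def fit_power_law_exponent(ks, counts):
    """Fit log(count) = c - gamma * log(k) by ordinary least squares; return gamma.

    Points with non-positive k or count are skipped because their logarithm
    is undefined. At least two points with distinct k are required.
    """
    pts = [(math.log(k), math.log(c)) for k, c in zip(ks, counts) if k > 0 and c > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return -sxy / sxx  # the slope of the line is -gamma


# Synthetic check (not PT data): an exact power law with exponent 3.
ks = list(range(1, 11))
counts = [1000.0 * k ** -3.0 for k in ks]
gamma = fit_power_law_exponent(ks, counts)
print(round(gamma, 6))
```

On real, noisy histograms the result depends on which data points enter the fit, which is one reason to treat fitted exponents with caution.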

Another network property analysed here is the size of the direct neighbourhood
of a given station introduced in section 2: the number of stations one may
reach from that station without changing. By definition, for a given station this
quantity $Z_1$ is obtained by counting all stations which belong to the routes crossing
it. We show the number $N(Z_1 > M)$ of stations with a neighbourhood larger
than $M$ as a function of $M$ in figure 3. This function, corresponding to an integrated
distribution, is by definition monotonic and thus smoother than the distribution
of $Z_1$ itself. As is seen from the double-logarithmic and log-linear plots, only the
largest (Paris) network develops a clear power-law tail, with an exponent for the
integrated distribution of about $\gamma - 1 = 2.7$. The Düsseldorf network may also be
approximated by two exponentials.
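
The shift of the exponent by one in the integrated distribution follows from a short calculation, sketched here in a continuum approximation and under the assumption that the distribution of the neighbourhood size itself, written $p(Z_1)$ for this illustration, has a power-law tail with exponent $\gamma > 1$:

```latex
\begin{align*}
  p(Z_1) &\sim Z_1^{-\gamma}, \qquad \gamma > 1, \\
  N(Z_1 > M) &\propto \int_M^{\infty} Z^{-\gamma}\, dZ
             = \frac{M^{-(\gamma-1)}}{\gamma-1}
             \sim M^{-(\gamma-1)} .
\end{align*}
```

An integrated exponent of about 2.7 would thus correspond to $\gamma \approx 3.7$ for the distribution of the neighbourhood size itself.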




Figure 3. Integrated distribution of the direct neighbourhood $Z_1$ of a given
station. Number of stations with $Z_1 > M$ for the PT networks of Berlin (squares),
Düsseldorf (crosses), Paris (circles). Left: double-logarithmic, right: log-linear
plot.
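
An integrated distribution of this kind can be computed from a list of neighbourhood sizes as in the following minimal Python sketch. The input values are invented for illustration, and the function name `integrated_distribution` is hypothetical.

```python
from bisect import bisect_right


def integrated_distribution(values):
    """Return a list of (M, number of entries strictly larger than M).

    M runs over the distinct values present in the input, in increasing order.
    """
    ordered = sorted(values)
    n = len(ordered)
    # bisect_right gives the number of entries <= M, so n minus it counts entries > M.
    return [(m, n - bisect_right(ordered, m)) for m in sorted(set(ordered))]


# Invented neighbourhood sizes for six stations (not PT data).
sizes = [2, 3, 3, 5, 5, 5]
curve = integrated_distribution(sizes)
print(curve)
```

By construction the second entry of each pair never increases with M, which is the monotonic behaviour that makes the integrated curve smoother than the raw histogram.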

Recently, another PT network property was reported to possess universal scaling
behaviour: it was shown that the mean distance between nodes of different degrees
is governed by a scaling law [?].

4. Segment distributions 

As we have already seen in the previous section, large PT networks may have
scale-free properties, as shown by the power-law behaviour of their node degree
distribution. Our next step is to study some other characteristics of the PT networks
and to check them for power-law behaviour. One example was given in the
previous section by the neighbourhood size $Z_1$. Here, we continue the analysis
for the quantity $P(L, N)$ introduced in section 2: the number of node segments of
length $L$ connected by $N$ lines. Our interest in this quantity is caused by the fact that
for a real "physical" network different numbers of links between nodes correspond to
different loads on the connection between these nodes. Besides the PT network, an
example may be given by the network of wires connecting a complex electric circuit,
or a network of tubes transmitting a fluid, etc. In these cases, information about
the loads on the links and their distribution is important not only for understanding
the features of the entire network, but also for optimizing its structure.

 L    Berlin   Düss.   Paris
 1    3.47     3.09    3.97
 2    3.58     3.58    4.59
 3    3.77     4.08    4.95
 4    4.55     3.82    6.06
 5    4.75     3.9     5.9

Table 1. Scaling exponent $\gamma_L$ obtained by the least-squares fit for the number
of sequences of length $L$ connected by $N$ lines, $P(L, N)$, for the PT networks of
Berlin, Düsseldorf and Paris.

First, we analyse $P(1, N)$: the number of segments of length 1 consisting of $N$
lines. From now on we will calculate the integral characteristics, that is, in calculating
$P(1, N)$ we take into account all sequences of stations of length 1 consisting of $N$
lines. The result is shown in figure 4a. Again, a power law holds:

$P(1, N) \sim N^{-\gamma_1}$,    (2)

with the exponent $\gamma_1$ ranging from 3 to 4 for the networks considered. The
least-squares fit gives the values listed in table 1. Obviously, the power-law behaviour
found for $P(1, N)$ should not hold for all values of $L$ and $N$. Indeed, the other
limiting case $P(L, 1)$ describes the distribution of lengths of different routes. The function
$P(L, 1)$ will have a maximum corresponding to the mean route length, provided
the network has enough routes for good statistics. The behaviour of the function
$P(L, N)$ with increasing $L$ is shown in figure 4. It is tempting to describe the plots
given in the figure by power laws with exponents $\gamma_L$ depending on the length
$L$:

$P(L, N) \sim N^{-\gamma_L}$.    (3)

Numerical values of the corresponding exponents $\gamma_L$ are given in table 1 for
$L = 1 \dots 5$. The columns of the table give results obtained by the least-squares fit for
each city separately. The data of figures 4a-f seem to obey power laws, but
it is too early to state this as a definite conclusion, for at least two reasons. The
first reason is that the statistics considered so far concern only three different
networks (the PT networks of three different cities). Although the networks
themselves were chosen to be large enough (see section 2), it is still desirable to support
the data obtained by the analysis of a larger number of networks. The second reason is
more subtle: it is obvious that $P(L, N)$ decreases with $N$ for the (real) PT networks
considered here. Indeed, the larger $N$, the smaller $P(N)$ and the smaller the
probability that several ($L$) subsequent nodes have the same node degree. Therefore,
the number of data points for $P(L, N)$ will always decrease with $L$, leading to
poor statistics for any real network: cf. the number of data points in figures 4a and 4f.
So the data for $\gamma_L$ presented in table 1 for each separate network are
to be considered as an attempt at a power-law fit rather than as a definite
conclusion about a universal power-law distribution. Nevertheless, we consider this
observation to be interesting for further study.

Figure 4. Number of sequences $P(L, N)$ of length $L$ connected by $N$ lines. a:
$L = 1$, b: $L = 2$, c: $L = 3$, d: $L = 4$, e: $L = 5$, f: $L = 6$. Symbols as in the preceding figures.

5. Conclusions, outlook, and . . . best wishes! 

This paper is written for a special issue of CMP in honour of Reinhard Folk's 60th
birthday. The majority of the contributions to the Festschrift reflect the honoree's
field of activity: phase transitions in condensed matter physics. In our introduction
we tried to show a possible link between complex network behaviour and criticality
in condensed matter provided by the scaling phenomenon, as we, being new to
the fascinating field of complex networks, understand it. With this paper we want to
congratulate Reinhard Folk on the occasion of his birthday and to acknowledge his
vivid interest and active participation in the numerous discussions about complex
networks during the great time two of us (CvF and YuH) had enjoying his wonderful
hospitality in Wintersemester 2004 in Linz.

The PT networks discussed in this paper provide one more example of
scale-free networks, as demonstrated by the power-law behaviour of their node degree
distribution (figure 2). Besides, some other properties of these networks may obey
scaling laws. An example was given by the integrated distribution of the
neighbourhood size $Z_1$ describing the number of stations which can be reached from a
given station without making a change. We found evidence that the functions $P(L, N)$
(the number of node segments of length $L$ connected by $N$ lines) may be described
by power-law fits (3), at least for low $L$. Our interest in these quantities is caused by
the fact that they are important for understanding specific properties of PT networks,
where many routes possess common subsets of nodes, and of the other examples of similar
structure that were briefly mentioned in the paper.

A natural continuation of this study is to improve the network statistics
considered here by taking a larger number of PT networks, as well as to continue the analysis
of different network characteristics.

One of us (YuH) acknowledges the Austrian Fonds zur Förderung der
wissenschaftlichen Forschung for support under project No. 16574-PHY.